1 Introduction

Mypackage is used for processing NGS sequencing files, mutation-related analysis, gene expression analysis, and visualization.

2 Generating MAF files

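  • Gene mutation data files pulled from the bioinformatics pipeline can be converted to MAF files with the mutToMAF function. MAF-format files obtained by other means can also be used.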

  • If you’re using ANNOVAR for variant annotations, maftools has a handy function annovarToMaf for converting tabular annovar outputs to MAF.

3 MAF field requirements

MAF files contain many fields, ranging from chromosome names to COSMIC annotations. However, most analyses in maftools use the following fields.

  • Mandatory fields: Hugo_Symbol, Chromosome, Start_Position, End_Position, Reference_Allele, Tumor_Seq_Allele2, Variant_Classification, Variant_Type and Tumor_Sample_Barcode.

  • Recommended optional fields: non-MAF-specific fields containing VAF (Variant Allele Frequency) and amino acid change information.

The complete specification of MAF files can be found on the NCI GDC documentation page.
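
As a minimal sketch of how the mandatory fields above could be checked before analysis, the base R snippet below tests a data frame for the required columns. The helper name `check_maf_fields` and the one-row `toy_maf` are illustrative assumptions, not part of Mypackage or maftools; the row values are taken from the EPHA5 variant shown in the `head(MAF)` output of this vignette.

```r
# Mandatory MAF columns, as listed above.
required_fields <- c(
  "Hugo_Symbol", "Chromosome", "Start_Position", "End_Position",
  "Reference_Allele", "Tumor_Seq_Allele2", "Variant_Classification",
  "Variant_Type", "Tumor_Sample_Barcode"
)

# Hypothetical helper: stops with an informative error if any field is absent.
check_maf_fields <- function(maf, required = required_fields) {
  missing_fields <- setdiff(required, colnames(maf))
  if (length(missing_fields) > 0) {
    stop("Missing mandatory MAF fields: ", paste(missing_fields, collapse = ", "))
  }
  invisible(TRUE)
}

# One-row toy MAF used only to exercise the check.
toy_maf <- data.frame(
  Hugo_Symbol = "EPHA5", Chromosome = "4",
  Start_Position = 66356270, End_Position = 66356270,
  Reference_Allele = "G", Tumor_Seq_Allele2 = "A",
  Variant_Classification = "Silent", Variant_Type = "SNP",
  Tumor_Sample_Barcode = "#1007",
  stringsAsFactors = FALSE
)

check_maf_fields(toy_maf)  # passes silently when all fields are present
```

Running such a check first tends to give a clearer error than a failure deep inside a plotting function.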

This vignette demonstrates the usage and application of Mypackage on an example MAF file from Yunying cohort 1-5 and TCGA CRC cohort 6.

4 Installation

  • The R package has been uploaded online. You can request the source tarball from the author or install it remotely, as shown below.
install.packages("./Mypackage_1.3.1.tar.gz", repos = NULL, type = "source")
remotes::install_git("https://gitee.com/YPM2022/mypackage")

5 Overview of the package

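Mypackage is a comprehensive R package that combines mutation analysis and gene expression analysis. It covers MAF file generation, pathway mutation difference analysis, differential gene expression analysis, single-cell analysis, and the related visualizations.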

5.1 Create MAF files

The 'mutToMAF' function converts NGS sequencing files pulled from the bioinformatics pipeline, for example gzy#6Q_g_639_review_for_report.xls, into MAF-format files.

library(Mypackage)
# Example
data("clindata")  # sample-level data to be merged in
root_dir <- system.file("example", package = "Mypackage")  # bundled example data
MAF <- mutToMAF(root_dir = root_dir, clin = clindata, saveDATA = FALSE, mut_filter = TRUE,
                tumor_t = 10, site_depth = 100, hotspot_vaf = 0.009,
                non_hotspot_vaf = 0.045, hotspotloss_vaf = 0.095, non_hotspotloss_vaf = 0.195)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
## 
##     src, summarize
## The following objects are masked from 'package:base':
## 
##     format.pval, units
##  Loop gzy#1007Q_g_639_review_for_report.xls time: 0.0501861572265625 s 
##  
##  Loop gzy#101Q_g_639_review_for_report.xls time: 0.0321450233459473 s 
##  
##  Loop gzy#1022Q_g_639_review_for_report.xls time: 0.0348548889160156 s 
##  
##  Loop gzy#1028Q_g_639_review_for_report.xls time: 0.0384509563446045 s 
##  
##  Loop gzy#1039Q_g_639_review_for_report.xls time: 0.0318408012390137 s 
##  
##  Loop gzy#1049Q_g_639_review_for_report.xls time: 0.03857421875 s 
##  
##  Loop gzy#1054Q_g_639_review_for_report.xls time: 0.0349640846252441 s 
##  
##  Loop gzy#1060Q_g_639_review_for_report.xls time: 0.0409340858459473 s 
##  
##  Loop gzy#1067Q_g_639_review_for_report.xls time: 0.0382578372955322 s 
##  
##  Loop gzy#1072Q_g_639_review_for_report.xls time: 0.0408477783203125 s 
##  
##  Loop gzy#1105Q_g_639_review_for_report.xls time: 0.0295600891113281 s 
##  
##  Loop gzy#110Q_g_639_review_for_report.xls time: 0.032588005065918 s 
##  
##  Loop gzy#1116Q_g_639_review_for_report.xls time: 0.0348560810089111 s 
##  
##  Loop gzy#1117Q_g_639_review_for_report.xls time: 0.0526669025421143 s 
##  
##  Loop gzy#1123Q_g_639_review_for_report.xls time: 0.0380508899688721 s 
##  
##  Loop gzy#1128Q_g_639_review_for_report.xls time: 0.0453240871429443 s 
##  
##  Loop gzy#1139Q_g_639_review_for_report.xls time: 0.0338740348815918 s 
##  
##  Loop gzy#114Q_g_639_review_for_report.xls time: 0.0263550281524658 s 
##  
##  Loop gzy#1153Q_g_639_review_for_report.xls time: 0.0348250865936279 s 
##  
##  Loop gzy#1161Q_g_639_review_for_report.xls time: 0.0408530235290527 s 
##  
##  Loop gzy#1171Q_g_639_review_for_report.xls time: 0.0442790985107422 s 
##  
##  Loop gzy#1177Q_g_639_review_for_report.xls time: 0.0384631156921387 s 
##  
##  Loop gzy#1184Q_g_639_review_for_report.xls time: 0.0327038764953613 s 
##  
##  Loop gzy#1196Q_g_639_review_for_report.xls time: 0.0344798564910889 s 
##  
##  Loop gzy#1197Q_g_639_review_for_report.xls time: 0.039280891418457 s 
##  
##  Loop gzy#1199Q_g_639_review_for_report.xls time: 0.0367481708526611 s 
##  
##  Loop gzy#1212Q_g_639_review_for_report.xls time: 0.0361571311950684 s 
##  
##  Loop gzy#1221Q_g_639_review_for_report.xls time: 0.040241003036499 s 
##  
##  Loop gzy#1243Q_g_639_review_for_report.xls time: 0.0416638851165771 s 
##  
##  Loop gzy#1247Q_g_639_review_for_report.xls time: 0.038640022277832 s 
##  
##  Loop gzy#1258Q_g_639_review_for_report.xls time: 0.0365221500396729 s 
##  
##  Loop gzy#1280Q_g_639_review_for_report.xls time: 0.0260610580444336 s 
##  
##  Loop gzy#1281Q_g_639_review_for_report.xls time: 0.0413579940795898 s 
##  
##  Loop gzy#128Q_g_639_review_for_report.xls time: 0.0221409797668457 s 
##  
##  Loop gzy#1296Q_g_639_review_for_report.xls time: 0.0277969837188721 s 
##  
##  Loop gzy#129Q_g_639_review_for_report.xls time: 0.0263669490814209 s 
##  
##  Loop gzy#1306Q_g_639_review_for_report.xls time: 0.0307548046112061 s 
##  
##  Loop gzy#1331Q_g_639_review_for_report.xls time: 0.0383780002593994 s 
##  
##  Loop gzy#1333Q_g_639_review_for_report.xls time: 0.0341758728027344 s 
##  
##  Loop gzy#1341Q_g_639_review_for_report.xls time: 0.0340240001678467 s 
##  
##  Loop gzy#1362Q_g_639_review_for_report.xls time: 0.0349102020263672 s 
##  
##  Loop gzy#1363Q_g_639_review_for_report.xls time: 0.0349380970001221 s 
##  
##  Loop gzy#1375Q_g_639_review_for_report.xls time: 0.0296170711517334 s 
##  
##  Loop gzy#1384Q_g_639_review_for_report.xls time: 0.0336718559265137 s 
##  
##  Loop gzy#1441Q_g_639_review_for_report.xls time: 0.0400218963623047 s 
##  
##  Loop gzy#1468Q_g_639_review_for_report.xls time: 0.0426368713378906 s 
##  
##  Loop gzy#1474Q_g_639_review_for_report.xls time: 0.0371661186218262 s 
##  
##  Loop gzy#149Q_g_639_review_for_report.xls time: 0.0161919593811035 s 
##  
##  Loop gzy#1502Q_g_639_review_for_report.xls time: 0.0454778671264648 s 
##  
##  Loop gzy#154Q_g_639_review_for_report.xls time: 0.0188789367675781 s 
##  
##  Loop gzy#1550Q_g_639_review_for_report.xls time: 0.0363280773162842 s 
##  
##  Loop gzy#1579Q_g_639_review_for_report.xls time: 0.0423340797424316 s 
##  
##  Loop gzy#1582Q_g_639_review_for_report.xls time: 0.0424480438232422 s 
##  
##  Loop gzy#1586Q_g_639_review_for_report.xls time: 0.0304620265960693 s 
##  
##  Loop gzy#1595Q_g_639_review_for_report.xls time: 0.0406959056854248 s 
##  
##  Loop gzy#1611Q_g_639_review_for_report.xls time: 0.0349469184875488 s 
##  
##  Loop gzy#1620Q_g_639_review_for_report.xls time: 0.0386369228363037 s 
##  
##  Loop gzy#1632Q_g_639_review_for_report.xls time: 0.0412440299987793 s 
##  
##  Loop gzy#1638Q_g_639_review_for_report.xls time: 0.038316011428833 s 
##  
##  Loop gzy#1646Q_g_639_review_for_report.xls time: 0.036395788192749 s 
##  
##  Loop gzy#1647Q_g_639_review_for_report.xls time: 0.0370559692382812 s 
##  
##  Loop gzy#1656Q_g_639_review_for_report.xls time: 0.0474820137023926 s 
##  
##  Loop gzy#1660Q_g_639_review_for_report.xls time: 0.0334079265594482 s 
##  
##  Loop gzy#1666Q_g_639_review_for_report.xls time: 0.0550510883331299 s 
##  
##  Loop gzy#1673Q_g_639_review_for_report.xls time: 0.0352799892425537 s 
##  
##  Loop gzy#1683Q_g_639_review_for_report.xls time: 0.0245800018310547 s 
##  
##  Loop gzy#1687Q_g_639_review_for_report.xls time: 0.0535769462585449 s 
##  
##  Loop gzy#1688Q_g_639_review_for_report.xls time: 0.0336160659790039 s 
##  
##  Loop gzy#16Q_g_639_review_for_report.xls time: 0.0208749771118164 s 
##  
##  Loop gzy#1705Q_g_639_review_for_report.xls time: 0.0407700538635254 s 
##  
##  Loop gzy#1707Q_g_639_review_for_report.xls time: 0.0365431308746338 s 
##  
##  Loop gzy#170Q_g_639_review_for_report.xls time: 0.0210089683532715 s 
##  
##  Loop gzy#1713Q_g_639_review_for_report.xls time: 0.0403051376342773 s 
##  
##  Loop gzy#1726Q_g_639_review_for_report.xls time: 0.0393848419189453 s 
##  
##  Loop gzy#173Q_g_639_review_for_report.xls time: 0.0219800472259521 s 
##  
##  Loop gzy#1748Q_g_639_review_for_report.xls time: 0.0543339252471924 s 
##  
##  Loop gzy#1749Q_g_639_review_for_report.xls time: 0.0439789295196533 s 
##  
##  Loop gzy#175Q_g_639_review_for_report.xls time: 0.0255758762359619 s 
##  
##  Loop gzy#1760Q_g_639_review_for_report.xls time: 0.0339739322662354 s 
##  
##  Loop gzy#192Q_g_639_review_for_report.xls time: 0.0292568206787109 s 
##  
##  Loop gzy#196Q_g_639_review_for_report.xls time: 0.0250132083892822 s 
##  
##  Loop gzy#208Q_g_639_review_for_report.xls time: 0.0266220569610596 s 
##  
##  Loop gzy#214Q_g_639_review_for_report.xls time: 0.0265560150146484 s 
##  
##  Loop gzy#229Q_g_639_review_for_report.xls time: 0.0385329723358154 s 
##  
##  Loop gzy#235Q_g_639_review_for_report.xls time: 0.0277061462402344 s 
##  
##  Loop gzy#239Q_g_639_review_for_report.xls time: 0.0332720279693604 s 
##  
##  Loop gzy#247Q_g_639_review_for_report.xls time: 0.0239999294281006 s 
##  
##  Loop gzy#248Q_g_639_review_for_report.xls time: 0.035560131072998 s 
##  
##  Loop gzy#255Q_g_639_review_for_report.xls time: 0.0225470066070557 s 
##  
##  Loop gzy#263Q_g_639_review_for_report.xls time: 0.0195901393890381 s 
##  
##  Loop gzy#265Q_g_639_review_for_report.xls time: 0.0339229106903076 s 
##  
##  Loop gzy#272Q_g_639_review_for_report.xls time: 0.0229549407958984 s 
##  
##  Loop gzy#277Q_g_639_review_for_report.xls time: 0.0295431613922119 s 
##  
##  Loop gzy#280Q_g_639_review_for_report.xls time: 0.0249881744384766 s 
##  
##  Loop gzy#281Q_g_639_review_for_report.xls time: 0.0264627933502197 s 
##  
##  Loop gzy#284Q_g_639_review_for_report.xls time: 0.0471351146697998 s 
##  
##  Loop gzy#313Q_g_639_review_for_report.xls time: 0.0315940380096436 s 
##  
##  Loop gzy#328Q_g_639_review_for_report.xls time: 0.0438308715820312 s 
##  
##  Loop gzy#32Q_g_639_review_for_report.xls time: 0.0304539203643799 s 
##  
##  Loop gzy#338Q_g_639_review_for_report.xls time: 0.041295051574707 s 
##  
##  Loop gzy#347Q_g_639_review_for_report.xls time: 0.0556631088256836 s 
##  
##  Loop gzy#34Q_g_639_review_for_report.xls time: 0.0207479000091553 s 
##  
##  Loop gzy#355Q_g_639_review_for_report.xls time: 0.0414969921112061 s 
##  
##  Loop gzy#36Q_g_639_review_for_report.xls time: 0.0234990119934082 s 
##  
##  Loop gzy#376Q_g_639_review_for_report.xls time: 0.0312221050262451 s 
##  
##  Loop gzy#383Q_g_639_review_for_report.xls time: 0.0382370948791504 s 
##  
##  Loop gzy#401Q_g_639_review_for_report.xls time: 0.0210621356964111 s 
##  
##  Loop gzy#405Q_g_639_review_for_report.xls time: 0.0314369201660156 s 
##  
##  Loop gzy#407Q_g_639_review_for_report.xls time: 0.0208220481872559 s 
##  
##  Loop gzy#415Q_g_639_review_for_report.xls time: 0.0250520706176758 s 
##  
##  Loop gzy#427Q_g_639_review_for_report.xls time: 0.0229480266571045 s 
##  
##  Loop gzy#443Q_g_639_review_for_report.xls time: 0.0449631214141846 s 
##  
##  Loop gzy#448Q_g_639_review_for_report.xls time: 0.0385940074920654 s 
##  
##  Loop gzy#450Q_g_639_review_for_report.xls time: 0.0339028835296631 s 
##  
##  Loop gzy#453Q_g_639_review_for_report.xls time: 0.030238151550293 s 
##  
##  Loop gzy#455Q_g_639_review_for_report.xls time: 0.0485379695892334 s 
##  
##  Loop gzy#456Q_g_639_review_for_report.xls time: 0.0336101055145264 s 
##  
##  Loop gzy#458Q_g_639_review_for_report.xls time: 0.0259850025177002 s 
##  
##  Loop gzy#46Q_g_639_review_for_report.xls time: 0.0226130485534668 s 
##  
##  Loop gzy#471Q_g_639_review_for_report.xls time: 0.0276699066162109 s 
##  
##  Loop gzy#476Q_g_639_review_for_report.xls time: 0.0299739837646484 s 
##  
##  Loop gzy#477Q_g_639_review_for_report.xls time: 0.032494068145752 s 
##  
##  Loop gzy#495Q_g_639_review_for_report.xls time: 0.0237941741943359 s 
##  
##  Loop gzy#509Q_g_639_review_for_report.xls time: 0.0247330665588379 s 
##  
##  Loop gzy#519Q_g_639_review_for_report.xls time: 0.0400588512420654 s 
##  
##  Loop gzy#520Q_g_639_review_for_report.xls time: 0.0308849811553955 s 
##  
##  Loop gzy#527Q_g_639_review_for_report.xls time: 0.0257048606872559 s 
##  
##  Loop gzy#530Q_g_639_review_for_report.xls time: 0.0350780487060547 s 
##  
##  Loop gzy#535Q_g_639_review_for_report.xls time: 0.0259649753570557 s 
##  
##  Loop gzy#549Q_g_639_review_for_report.xls time: 0.0313842296600342 s 
##  
##  Loop gzy#55Q_g_639_review_for_report.xls time: 0.031480073928833 s 
##  
##  Loop gzy#568Q_g_639_review_for_report.xls time: 0.025629997253418 s 
##  
##  Loop gzy#56Q_g_639_review_for_report.xls time: 0.0274791717529297 s 
##  
##  Loop gzy#573Q_g_639_review_for_report.xls time: 0.0335450172424316 s 
##  
##  Loop gzy#591Q_g_639_review_for_report.xls time: 0.0355699062347412 s 
##  
##  Loop gzy#59Q_g_639_review_for_report.xls time: 0.0285429954528809 s 
##  
##  Loop gzy#601Q_g_639_review_for_report.xls time: 0.0485389232635498 s 
##  
##  Loop gzy#606Q_g_639_review_for_report.xls time: 0.0366849899291992 s 
##  
##  Loop gzy#619Q_g_639_review_for_report.xls time: 0.0424301624298096 s 
##  
##  Loop gzy#620Q_g_639_review_for_report.xls time: 0.0484368801116943 s 
##  
##  Loop gzy#632Q_g_639_review_for_report.xls time: 0.0355730056762695 s 
##  
##  Loop gzy#638Q_g_639_review_for_report.xls time: 0.0410878658294678 s 
##  
##  Loop gzy#653Q_g_639_review_for_report.xls time: 0.037074089050293 s 
##  
##  Loop gzy#669Q_g_639_review_for_report.xls time: 0.0326588153839111 s 
##  
##  Loop gzy#688Q_g_639_review_for_report.xls time: 0.103429079055786 s 
##  
##  Loop gzy#699Q_g_639_review_for_report.xls time: 0.031588077545166 s 
##  
##  Loop gzy#6Q_g_639_review_for_report.xls time: 0.019556999206543 s 
##  
##  Loop gzy#703Q_g_639_review_for_report.xls time: 0.0411009788513184 s 
##  
##  Loop gzy#714Q_g_639_review_for_report.xls time: 0.028425931930542 s 
##  
##  Loop gzy#718Q_g_639_review_for_report.xls time: 0.0385169982910156 s 
##  
##  Loop gzy#71Q_g_639_review_for_report.xls time: 0.0268590450286865 s 
##  
##  Loop gzy#722Q_g_639_review_for_report.xls time: 0.0295989513397217 s 
##  
##  Loop gzy#735Q_g_639_review_for_report.xls time: 0.0278708934783936 s 
##  
##  Loop gzy#743Q_g_639_review_for_report.xls time: 0.0279049873352051 s 
##  
##  Loop gzy#744Q_g_639_review_for_report.xls time: 0.0355300903320312 s 
##  
##  Loop gzy#748Q_g_639_review_for_report.xls time: 0.0363929271697998 s 
##  
##  Loop gzy#749Q_g_639_review_for_report.xls time: 0.035783052444458 s 
##  
##  Loop gzy#756Q_g_639_review_for_report.xls time: 0.0251269340515137 s 
##  
##  Loop gzy#758A1Q_g_639_review_for_report.xls time: 0.0318140983581543 s 
##  
##  Loop gzy#772Q_g_639_review_for_report.xls time: 0.0423810482025146 s 
##  
##  Loop gzy#78Q_g_639_review_for_report.xls time: 0.0186538696289062 s 
##  
##  Loop gzy#790Q_g_639_review_for_report.xls time: 0.0318808555603027 s 
##  
##  Loop gzy#795Q_g_639_review_for_report.xls time: 0.0245118141174316 s 
##  
##  Loop gzy#800Q_g_639_review_for_report.xls time: 0.0286071300506592 s 
##  
##  Loop gzy#811Q_g_639_review_for_report.xls time: 0.0278759002685547 s 
##  
##  Loop gzy#82Q_g_639_review_for_report.xls time: 0.020380973815918 s 
##  
##  Loop gzy#831Q_g_639_review_for_report.xls time: 0.0365421772003174 s 
##  
##  Loop gzy#832Q_g_639_review_for_report.xls time: 0.0337600708007812 s 
##  
##  Loop gzy#838Q_g_639_review_for_report.xls time: 0.0242860317230225 s 
##  
##  Loop gzy#839Q_g_639_review_for_report.xls time: 0.0288069248199463 s 
##  
##  Loop gzy#843Q_g_639_review_for_report.xls time: 0.0302529335021973 s 
##  
##  Loop gzy#844Q_g_639_review_for_report.xls time: 0.0230169296264648 s 
##  
##  Loop gzy#848Q_g_639_review_for_report.xls time: 0.0297150611877441 s 
##  
##  Loop gzy#84Q_g_639_review_for_report.xls time: 0.032397985458374 s 
##  
##  Loop gzy#858Q_g_639_review_for_report.xls time: 0.0325701236724854 s 
##  
##  Loop gzy#85Q_g_639_review_for_report.xls time: 0.0421891212463379 s 
##  
##  Loop gzy#865Q_g_639_review_for_report.xls time: 0.0314300060272217 s 
##  
##  Loop gzy#869Q_g_639_review_for_report.xls time: 0.0313520431518555 s 
##  
##  Loop gzy#890Q_g_639_review_for_report.xls time: 0.0269908905029297 s 
##  
##  Loop gzy#893Q_g_639_review_for_report.xls time: 0.0329298973083496 s 
##  
##  Loop gzy#895Q_g_639_review_for_report.xls time: 0.027446985244751 s 
##  
##  Loop gzy#89Q_g_639_review_for_report.xls time: 0.0206270217895508 s 
##  
##  Loop gzy#902Q_g_639_review_for_report.xls time: 0.0285041332244873 s 
##  
##  Loop gzy#923Q_g_639_review_for_report.xls time: 0.0380890369415283 s 
##  
##  Loop gzy#933Q_g_639_review_for_report.xls time: 0.0325620174407959 s 
##  
##  Loop gzy#945Q_g_639_review_for_report.xls time: 0.0309188365936279 s 
##  
##  Loop gzy#961Q_g_639_review_for_report.xls time: 0.0289218425750732 s 
##  
##  Loop gzy#973Q_g_639_review_for_report.xls time: 0.0303330421447754 s 
##  
##  Loop gzy#986Q_g_639_review_for_report.xls time: 0.0424778461456299 s 
##  
## Total: 189
head(MAF)
##   Druggable                     Report_C2 Tumor_VAF Chromosome Start_Position
## 1                 exon5,c.1227C>T,p.G409=     0.160          4       66356270
## 2         - exon11,c.1401_1404dup,p.L469*     0.193          5      112157674
## 3               exon16,c.3080A>G,p.Y1027C     0.112          5      112174371
## 4         -     exon16,c.4348C>T,p.R1450*     0.113          5      112175639
## 5                  exon8,c.797G>T,p.G266V     0.235         17        7577141
## 6         1      exon10,c.1633G>A,p.E545K     0.189          3      178936091
##   End_Position Reference_Allele Tumor_Seq_Allele2           FILTER Hugo_Symbol
## 1     66356270                G                 A             PASS       EPHA5
## 2    112157674                C             CAATG             PASS         APC
## 3    112174371                A                 G             PASS         APC
## 4    112175639                C                 T             PASS         APC
## 5      7577141                C                 A             PASS        TP53
## 6    178936095                G                 A clustered_events      PIK3CA
##        RefSeq          HGVSc        HGVSp t_ref_count t_alt_count depth
## 1 NM_004439.8      c.1227C>T    p.Gly409=         386          77   463
## 2 NM_000038.6 c.1401_1404dup    p.Leu469*        2099         499  2598
## 3 NM_000038.6      c.3080A>G p.Tyr1027Cys        1955         237  2192
## 4 NM_000038.6      c.4348C>T   p.Arg1450*        1680         219  1899
## 5 NM_000546.6       c.797G>T  p.Gly266Val        4247        1296  5543
## 6 NM_006218.4      c.1633G>A  p.Glu545Lys        2904         692  3596
##                               file_name Tumor_Sample_Barcode is_hotsgene
## 1 gzy#1007Q_g_639_review_for_report.xls                #1007       FALSE
## 2 gzy#1007Q_g_639_review_for_report.xls                #1007       FALSE
## 3 gzy#1007Q_g_639_review_for_report.xls                #1007       FALSE
## 4 gzy#1007Q_g_639_review_for_report.xls                #1007       FALSE
## 5 gzy#1007Q_g_639_review_for_report.xls                #1007       FALSE
## 6  gzy#101Q_g_639_review_for_report.xls                 #101       FALSE
##   mutation_class HGVSp_Short Variant_Type Variant_Classification
## 1            4.0     p.G409=          SNP                 Silent
## 2            2.0     p.L469*          INS        Frame_Shift_Ins
## 3            4.0    p.Y1027C          SNP      Missense_Mutation
## 4            2.0    p.R1450*          SNP      Nonsense_Mutation
## 5            4.0     p.G266V          SNP      Missense_Mutation
## 6            1.1     p.E545K          SNP      Missense_Mutation
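
To get a quick feel for a table like the one above, the sketch below rebuilds the six displayed rows as a toy data frame and summarises them with base R. Only a subset of columns is reproduced, and the object name `toy_maf` is an assumption for illustration; on real data the same calls would be applied to `MAF` directly.

```r
# The six rows printed by head(MAF), restricted to a few columns.
toy_maf <- data.frame(
  Hugo_Symbol = c("EPHA5", "APC", "APC", "APC", "TP53", "PIK3CA"),
  Tumor_Sample_Barcode = c("#1007", "#1007", "#1007", "#1007", "#1007", "#101"),
  Variant_Classification = c("Silent", "Frame_Shift_Ins", "Missense_Mutation",
                             "Nonsense_Mutation", "Missense_Mutation",
                             "Missense_Mutation"),
  t_ref_count = c(386, 2099, 1955, 1680, 4247, 2904),
  t_alt_count = c(77, 499, 237, 219, 1296, 692),
  depth = c(463, 2598, 2192, 1899, 5543, 3596),
  stringsAsFactors = FALSE
)

# In the displayed rows, depth equals reference plus alternate read counts.
stopifnot(all(toy_maf$t_ref_count + toy_maf$t_alt_count == toy_maf$depth))

# Variants per sample and per classification.
variants_per_sample <- table(toy_maf$Tumor_Sample_Barcode)
variants_per_class <- table(toy_maf$Variant_Classification)
print(variants_per_sample)
print(variants_per_class)

# Drop silent variants, a common first step before downstream analysis.
non_silent <- subset(toy_maf, Variant_Classification != "Silent")
print(nrow(non_silent))
```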

5.1.1 MAF Visualization

5.1.1.1 mut_network

Co-occurrence and mutual-exclusivity analysis of gene mutations, with visualization.

#example
data("mutation_CRC")
mut_network(SNV=mutation_CRC,
                    top=20,pValue=0.01,customdata=NULL)
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:dplyr':
## 
##     as_data_frame, groups, union
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union
## Loading required package: ggplot2
## 
## Attaching package: 'tidyr'
## The following object is masked from 'package:igraph':
## 
##     crossing
## -Validating
## --Removed 35515 duplicated variants
## -Silent variants: 68601 
## -Summarizing
## --Possible FLAGS among top ten genes:
##   TTN
##   SYNE1
##   MUC16
## -Processing clinical data
## --Missing clinical data
## -Finished in 5.650s elapsed (5.610s cpu)

## $network
##       gene1  gene2       pValue oddsRatio    00    01    11    10         pAdj
##      <char> <char>        <num>     <num> <int> <int> <int> <int>        <num>
##   1:  ZFHX4   FAT4 2.025755e-18 7.9426605   415    67    58    45 3.683192e-17
##   2:   PCLO  OBSCN 1.867129e-15 8.1230346   444    57    43    41 3.111882e-14
##   3:   RYR3  OBSCN 4.627314e-15 8.0108583   445    58    42    40 7.118944e-14
##   4:   RYR3  CSMD3 9.368810e-15 7.9370909   447    56    41    41 1.319105e-13
##   5:   RYR3   PCLO 9.893290e-15 8.5288571   457    46    38    44 1.319105e-13
##  ---                                                                          
## 186:   TP53    TTN 8.660303e-01 0.9679041   130   109   155   191 8.837044e-01
## 187:   TP53   FAT3 9.041778e-01 0.9712145   205    34    48   298 9.179470e-01
## 188:  LRP1B   TP53 1.000000e+00 0.9812539   199   289    57    40 1.000000e+00
## 189:  LRP1B   KRAS 1.000000e+00 1.0022500   282   206    41    56 1.000000e+00
## 190: DNAH11   KRAS 1.000000e+00 1.0148352   288   210    37    50 1.000000e+00
##                   Event         pair event_ratio
##                  <char>       <char>      <char>
##   1:       Co_Occurence  FAT4, ZFHX4      58/112
##   2:       Co_Occurence  OBSCN, PCLO       43/98
##   3:       Co_Occurence  OBSCN, RYR3       42/98
##   4:       Co_Occurence  CSMD3, RYR3       41/97
##   5:       Co_Occurence   PCLO, RYR3       38/90
##  ---                                            
## 186: Mutually_Exclusive    TP53, TTN     155/300
## 187: Mutually_Exclusive   FAT3, TP53      48/332
## 188: Mutually_Exclusive  LRP1B, TP53      57/329
## 189:       Co_Occurence  KRAS, LRP1B      41/262
## 190:       Co_Occurence DNAH11, KRAS      37/260
## 
## $plot
## Warning: The following aesthetics were dropped during statistical transformation: xend
## and yend.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
##   the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
##   variable into a factor?
## Warning: Raster pixels are placed at uneven horizontal intervals and will be shifted
## ℹ Consider using `geom_tile()` instead.
## Raster pixels are placed at uneven horizontal intervals and will be shifted
## ℹ Consider using `geom_tile()` instead.
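
The columns of `$network` (pairwise counts `00`, `01`, `11`, `10`, an odds ratio, and a p-value) resemble the output of the pairwise Fisher's exact tests used by `maftools::somaticInteractions`. Assuming that is what `mut_network` computes, which this vignette does not state explicitly, the first row (ZFHX4 and FAT4) can be approximated in base R as below. The layout of the 2x2 table and the `event_ratio` arithmetic are inferred from the printed values.

```r
# Counts from row 1 of the $network table above (ZFHX4 vs FAT4).
n00 <- 415  # neither gene mutated
n01 <- 67   # only one of the two genes mutated
n10 <- 45   # only the other gene mutated
n11 <- 58   # both genes mutated

# 2x2 contingency table with the concordant counts on the diagonal.
contingency <- matrix(
  c(n00, n01, n10, n11), nrow = 2,
  dimnames = list(geneA = c("wt", "mut"), geneB = c("wt", "mut"))
)

ft <- fisher.test(contingency)

# An odds ratio above 1 suggests co-occurrence, below 1 mutual exclusivity.
# The label spelling follows the Event column in the output above.
event <- if (ft$estimate > 1) "Co_Occurence" else "Mutually_Exclusive"

# event_ratio in the table appears to be n11 / (n01 + n10).
event_ratio <- paste0(n11, "/", n01 + n10)

print(ft$p.value)
print(unname(ft$estimate))
print(event)
print(event_ratio)
```

Because many gene pairs are tested at once, the adjusted p-value (`pAdj`) is usually the safer column to threshold on.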

5.1.1.2 maf_cor

This function analyzes mutation and clinical data correlation with MAF format data. It calculates the VAF (Variant Allele Frequency) for each gene and selects the top genes based on the specified criteria. Then, it calculates the correlation between the selected genes and a specified clinical variable. The function also generates a correlation plot to visualize the relationships between the genes and the clinical variable.

library(corrplot)
## corrplot 0.95 loaded
#example
data<-maf_cor(mutation_data=MAF,
           clin=NULL,
           gene=NULL,top=20,
           cin_col=NULL,corrplot_method=c("pie"))
## Time difference of 0.04077601 secs

## Registered S3 methods overwritten by 'ggcor':
##   method from     
##   &.gg   patchwork
##   *.gg   patchwork
## 
## ********************************************************
## Note: As of version 0.9.8, ggcor does not change the
##   default ggplot2 continuous fill scale anymore. To
##   recover the previous behavior, execute:
##     set_scale()
##   Instead of using the set_scale() function, we
##   recommend adding the 'scale_fill_*()' function
##   to the plot as needed.
## ********************************************************
## 
## Attaching package: 'ggcor'
## The following object is masked from 'package:stats':
## 
##     filter

head(data)
##       TP53 APC KRAS PIK3CA LRP1B RET FBXW7 SMAD4 ATM FAT1 KMT2C ERBB2 ARID1A
## #1007    1   3    0      0     0   0     0     0   0    0     0     0      0
## #101     1   2    1      1     0   0     1     0   0    0     0     0      0
## #1022    1   1    0      0     1   0     0     0   0    1     0     0      0
## #1028    1   2    0      0     0   0     0     0   0    0     0     0      0
## #1039    0   1    1      2     0   0     1     0   0    0     0     0      0
## #1049    1   0    1      1     0   0     0     0   0    0     0     0      0
##       ERBB4 EPHA5 BRAF BRCA2 CHD4 PRKDC AMER1
## #1007     0     1    0     0    0     0     0
## #101      0     0    0     0    0     0     0
## #1022     0     0    0     1    0     0     0
## #1028     0     1    0     0    0     0     0
## #1039     0     0    0     0    0     0     0
## #1049     0     0    0     0    0     0     0
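
The matrix returned above has samples in rows and per-gene mutation counts in columns. As a rough sketch of the correlation step, the snippet below rebuilds the first four gene columns of the printed rows and computes pairwise Spearman correlations with base R. Whether `maf_cor` uses Spearman or another coefficient is not stated in this vignette, so the `method` choice here is an assumption.

```r
# First four gene columns of head(data), copied from the output above.
counts <- matrix(
  c(1, 3, 0, 0,
    1, 2, 1, 1,
    1, 1, 0, 0,
    1, 2, 0, 0,
    0, 1, 1, 2,
    1, 0, 1, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("#1007", "#101", "#1022", "#1028", "#1039", "#1049"),
    c("TP53", "APC", "KRAS", "PIK3CA")
  )
)

# Gene-by-gene correlation of mutation counts across samples.
cor_mat <- cor(counts, method = "spearman")
print(round(cor_mat, 2))
```

A matrix like `cor_mat` could then be drawn with `corrplot::corrplot(cor_mat, method = "pie")`, which matches the `corrplot_method` value used in the call above.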

5.1.1.3 path_mut_visual

This function visualizes the mutation rate of tumor pathways based on SNV and gene data.

#example
data("mutation_CRC")
data("gene_group_data")
data("pathway_data")
gene_of_interest <- colnames(gene_group_data)[[2]]
tumor_type <- "CRC TCGA"
color_vector <- c("#757575", "#FF4040")
result <- path_mut_visual(result_data=NULL,
                           SNV = mutation_CRC,
                           gene = gene_of_interest,
                           Gene_group = gene_group_data,
                           Type = c("Wild", "Mut"),
                           pathway_gene_data = pathway_data,
                           tumor = tumor_type,
                           heatmap=TRUE,
                           heatmap_col=NULL,
                           color = color_vector,
                           test = "wilcox.test",ns=FALSE,
                           p_0.05=FALSE,p_0.01=FALSE,p_0.001=FALSE,p_0.0001=FALSE)
## Warning: package 'ggpubr' was built under R version 4.5.1
## Warning: package 'ComplexHeatmap' was built under R version 4.5.1
## Warning: package 'tidyHeatmap' was built under R version 4.5.1
##  1 : Oh yeah! Apoptosis 有组间显著性差异(p<0.0001) 
##  2 : Oh yeah! Cell cycle 有组间显著性差异(p<0.0001) 
##  3 : Oh yeah! Chromatin histone modifiers 有组间显著性差异(p<0.0001) 
##  4 : Oh yeah! Chromatin other 有组间显著性差异(p<0.0001) 
##  5 : Oh yeah! Chromatin SWI/SNF complex 有组间显著性差异(p<0.0001) 
##  6 : Oh yeah! Epigenetics DNA modifiers 有组间显著性差异(p<0.0001) 
##  7 : Oh yeah! Genome integrity 有组间显著性差异(p<0.0001) 
##  8 : Oh yeah! Histone modification 有组间显著性差异(p<0.0001) 
##  9 : Oh yeah! Immune signaling 有组间显著性差异(p<0.0001) 
##  10 : Oh yeah! MAPK signaling 有组间显著性差异(p<0.0001) 
##  11 : Oh yeah! Metabolism 有组间显著性差异(p<0.0001) 
##  12 : Oh yeah! NFKB signaling 有组间显著性差异(p<0.0001) 
##  13 : Oh yeah! NOTCH signaling 有组间显著性差异(p<0.0001) 
##  14 : Oh yeah! Other 有组间显著性差异(p<0.0001) 
##  15 : Oh yeah! Other signaling 有组间显著性差异(p<0.0001) 
##  16 : Oh yeah! PI3K signaling 有组间显著性差异(p<0.0001) 
##  17 : Oh yeah! Protein homeostasis/ubiquitination 有组间显著性差异(p<0.0001) 
##  18 : Oh yeah! RNA abundance 有组间显著性差异(p<0.0001) 
##  19 : Oh yeah! RTK signaling 有组间显著性差异(p<0.0001) 
##  20 : Oh yeah! Splicing 有组间显著性差异(p<0.0001) 
##  21 : Oh yeah! TGFB signaling 有组间显著性差异(p<0.0001) 
##  22 : Oh yeah! TOR signaling 有组间显著性差异(p<0.0001) 
##  23 : Oh yeah! Transcription factor 有组间显著性差异(p<0.0001) 
##  24 : Oh yeah! Wnt/B-catenin signaling 有组间显著性差异(p<0.0001) 
## Warning: Vectorized input to `element_text()` is not officially supported.
## ℹ Results may be unexpected or may change in future versions of ggplot2.
print(result$heatmap)

print(result$path_mut_plot)
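
One plausible reading of the comparison performed here is: score each sample by the fraction of a pathway's genes that are mutated, then compare the scores between the `Wild` and `Mut` groups with the test named in `test`. The sketch below follows that reading on a small invented mutation matrix. The pathway membership, sample names, and the per-sample score definition are all assumptions for illustration and may differ from what `path_mut_visual` does internally.

```r
# Toy binary mutation matrix (1 = mutated); all values are invented.
mut <- matrix(
  c(0, 0, 1,
    0, 1, 0,
    0, 0, 0,
    1, 0, 0,
    1, 1, 1,
    1, 1, 0,
    0, 1, 1,
    1, 1, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("S", 1:8), c("geneA", "geneB", "geneC"))
)

# Hypothetical grouping: first four samples wild type, last four mutant.
group <- factor(rep(c("Wild", "Mut"), each = 4), levels = c("Wild", "Mut"))
pathway_genes <- c("geneA", "geneB", "geneC")  # hypothetical pathway

# Per-sample pathway mutation rate: share of pathway genes that are mutated.
pathway_rate <- rowMeans(mut[, pathway_genes, drop = FALSE])

# Wilcoxon rank-sum test between groups; exact = FALSE because of ties.
wt <- wilcox.test(pathway_rate ~ group, exact = FALSE)
print(tapply(pathway_rate, group, mean))
print(wt$p.value)
```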

5.2 Gene expression data

5.2.1 Expression data processing

exp_geneIDtoSYMBOL

This function converts the gene identifiers in expression data to gene symbols (SYMBOL).

#example
data("exp_raw")
exp_raw<-exp_geneIDtoSYMBOL(exp=exp_raw,genecoltype="ENTREZID")
## 
## 'select()' returned 1:1 mapping between keys and columns
## Warning in clusterProfiler::bitr(exp[[colnames(exp)[[1]]]], fromType =
## "ENTREZID", : 0.01% of input gene IDs are fail to map...
print(exp_raw[["gene"]][1:10,1:2])
##     ENTREZID    SYMBOL
## 1          1      A1BG
## 2     503538  A1BG-AS1
## 3      29974      A1CF
## 4          2       A2M
## 5     144571   A2M-AS1
## 6     144568     A2ML1
## 7  100874108 A2ML1-AS1
## 8  106478979 A2ML1-AS2
## 9          3     A2MP1
## 10    127550   A3GALT2
print(exp_raw[["data"]][1:10,1:5])
##     ENTREZID     SYMBOL TCGA-3L-AA1B-01A TCGA-4N-A93T-01A TCGA-4T-AA8H-01A
## 1          1       A1BG          -0.2986          -0.1453          -0.6105
## 2         10       NAT2           1.1236           0.2581           1.8891
## 3        100        ADA          -0.6428          -0.6827          -2.4357
## 4       1000       CDH2           0.7466          -0.6900          -0.9559
## 5      10000       AKT3           0.4759          -1.4576          -1.4286
## 6  100009613  LINC02584          -0.2082          -0.9571          -0.9571
## 7  100009667   POU5F1P5          -0.6915          -0.6915          -0.3571
## 8  100009668   POU5F1P6           0.8188          -0.5041          -0.6792
## 9  100009676 ZBTB11-AS1           0.4024          -0.8768          -0.1029
## 10     10001       MED6           0.3399          -1.4712          -0.7339
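
The warning above shows that the conversion relies on `clusterProfiler::bitr` with `fromType = "ENTREZID"`. The lookup itself amounts to a keyed join, which the base R sketch below illustrates without any annotation database. The three mappings and expression values come from the output above; the unmapped ID `"0"` and its value are invented to show how unmapped rows might be handled.

```r
# Small ID-to-symbol lookup table, taken from the printed output.
id_map <- data.frame(
  ENTREZID = c("1", "10", "100"),
  SYMBOL = c("A1BG", "NAT2", "ADA"),
  stringsAsFactors = FALSE
)

# Expression values for one sample; the last row is an invented, unmappable ID.
expr <- data.frame(
  ENTREZID = c("1", "10", "100", "0"),
  `TCGA-3L-AA1B-01A` = c(-0.2986, 1.1236, -0.6428, 0),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Attach symbols by matching on the ID column; unmapped IDs become NA.
expr$SYMBOL <- id_map$SYMBOL[match(expr$ENTREZID, id_map$ENTREZID)]

# Report the unmapped fraction, then keep only the mapped rows.
unmapped_fraction <- mean(is.na(expr$SYMBOL))
expr_mapped <- expr[!is.na(expr$SYMBOL), ]
print(unmapped_fraction)
print(expr_mapped)
```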

5.2.2 Expression Visualization

5.2.2.1 immu_visual

This function visualizes immune scores across sample groups, using either precomputed immune scores or scores estimated from gene expression data.

#example
data("Gene_group_CRC1")
exp_CRC<-exp_raw$data
result<-immu_visual(im=NULL,exp=exp_CRC[,-1],
                    method = 'epic',
                    sample_group=Gene_group_CRC1,
                    tumor="CRC TCGA",heatmap=TRUE,
                    Type=c("Wild", "Mut"),
                    color=c("#757575", "#FF4040"),
                    geom_text=TRUE,
                    test = "wilcox.test")
## Loading required package: tibble
## 
## Attaching package: 'tibble'
## The following object is masked from 'package:igraph':
## 
##     as_data_frame
## Loading required package: survival
## Loading required package: patchwork
## Warning: package 'patchwork' was built under R version 4.5.1
## Loading required package: survminer
## 
## Attaching package: 'survminer'
## The following object is masked from 'package:survival':
## 
##     myeloma
## ==========================================================================
##   IOBR v0.99.8  Immuno-Oncology Biological Research 
##   For Tutorial: https://iobr.github.io/book/
##   For Help: https://github.com/IOBR/IOBR/issues
## 
##  If you use IOBR in published research, please cite:
##  DQ Zeng, YR Fang, WJ Qiu, ..., GC Yu*, WJ Liao*, (2024) 
##  IOBR2: Multidimensional Decoding Tumor Microenvironment for Immuno-Oncology Research. 
##  bioRxiv, 2024.01.13.575484; 
##  https://www.biorxiv.org/content/10.1101/2024.01.13.575484v2.full.pdf 
##  Higly Cited Paper and Hot Paper of WOS
## ==========================================================================
## 
## >>> Running EPIC
## Warning in IOBR::EPIC(bulk = eset, reference = ref, mRNA_cell = NULL, scaleExprs = TRUE): The optimization didn't fully converge for some samples:
## TCGA-A6-2680-01A; TCGA-A6-2681-01A; TCGA-A6-2683-01A; TCGA-A6-4107-01A; TCGA-A6-5661-01A; TCGA-A6-6782-01A; TCGA-AA-3488-01A; TCGA-AA-3511-01A; TCGA-AA-3516-01A; TCGA-AA-3526-01A; TCGA-AA-3529-01A; TCGA-AA-3530-01A; TCGA-AA-3548-01A; TCGA-AA-3560-01A; TCGA-AA-3562-01A; TCGA-AA-3684-01A; TCGA-AA-3692-01A; TCGA-AA-3697-01A; TCGA-AA-3712-01A; TCGA-AA-3815-01A; TCGA-AA-3821-01A; TCGA-AA-3831-01A; TCGA-AA-3837-01A; TCGA-AA-3850-01A; TCGA-AA-3862-01A; TCGA-AA-3977-01A; TCGA-AA-A00E-01A; TCGA-AA-A00J-01A; TCGA-AA-A00Z-01A; TCGA-AA-A010-01A; TCGA-AA-A01D-01A; TCGA-AA-A01R-01A; TCGA-AA-A02H-01A; TCGA-AA-A02R-01A; TCGA-AF-6655-01A; TCGA-AG-3578-01A; TCGA-AG-3584-01A; TCGA-AG-3599-01A; TCGA-AG-3608-01A; TCGA-AG-3727-01A; TCGA-AG-3887-01A; TCGA-AG-3896-01A; TCGA-AG-A00H-01A; TCGA-AG-A00Y-01A; TCGA-AG-A014-01A; TCGA-AG-A01J-01A; TCGA-AG-A02N-01A; TCGA-AH-6549-01A; TCGA-AZ-4308-01A; TCGA-AZ-4615-01A; TCGA-AZ-6598-01A; TCGA-AZ-6601-01A; TCGA-AZ-6603-01A; TCGA-AZ-6607-01A; TCGA-CA-5796-01A; TCGA-CK-4948-01B; TCGA-CK-4950-01A; TCGA-CK-5915-01A; TCGA-CM-4750-01A; TCGA-CM-5861-01A; TCGA-CM-6169-01A; TCGA-CM-6170-01A; TCGA-CM-6676-01A; TCGA-D5-5537-01A; TCGA-D5-6532-01A; TCGA-D5-6533-01A; TCGA-D5-6535-01A; TCGA-D5-6536-01A; TCGA-D5-6537-01A; TCGA-D5-6923-01A; TCGA-D5-6928-01A; TCGA-D5-6929-01A; TCGA-DC-6158-01A; TCGA-DC-6682-01A; TCGA-DM-A1D7-01A; TCGA-DM-A28H-01A; TCGA-DY-A0XA-01A; TCGA-EI-6512-01A; TCGA-EI-6513-01A; TCGA-EI-6883-01A; TCGA-EI-7004-01A; TCGA-F4-6570-01A; TCGA-F4-6807-01A; TCGA-F4-6808-01A; TCGA-F5-6465-01A; TCGA-F5-6810-01A; TCGA-F5-6861-01A; TCGA-F5-6864-01A; TCGA-G4-6293-01A; TCGA-G4-6297-01A; TCGA-G4-6302-01A; TCGA-G4-6310-01A; TCGA-G4-6314-01A; TCGA-G4-6321-01A; TCGA-G4-6322-01A; TCGA-G4-6588-01A; TCGA-G5-6233-01A; TCGA-QG-A5YW-01A; TCGA-WS-AB45-01A
##  - check fit.gof for the convergeCode and convergeMessage
## Warning in IOBR::EPIC(bulk = eset, reference = ref, mRNA_cell = NULL,
## scaleExprs = TRUE): mRNA_cell value unknown for some cell types: CAFs,
## Endothelial - using the default value of 0.4 for these but this might bias the
## true cell proportions from all cell types.
## # A tibble: 4,569 × 9
## # Groups:   method, Group [16]
##    SAMPLE_ID        Group method method_score      Q1      Q3     IQR LowerLimit
##    <chr>            <fct> <fct>         <dbl>   <dbl>   <dbl>   <dbl>      <dbl>
##  1 TCGA-3L-AA1B-01A Wild  CAFs       1.24e- 8 1.87e-6 0.0171  0.0171    -0.0256 
##  2 TCGA-3L-AA1B-01A Wild  CD4 T…     2.37e- 1 1.40e-7 0.142   0.142     -0.214  
##  3 TCGA-3L-AA1B-01A Wild  CD8 T…     4.62e- 3 2.31e-7 0.114   0.114     -0.171  
##  4 TCGA-3L-AA1B-01A Wild  Endot…     3.04e- 1 1.05e-6 0.159   0.159     -0.239  
##  5 TCGA-3L-AA1B-01A Wild  Macro…     7.11e-10 1.32e-7 0.0118  0.0118    -0.0177 
##  6 TCGA-3L-AA1B-01A Wild  NKcel…     1.94e- 8 9.59e-9 0.00460 0.00460   -0.00690
##  7 TCGA-3L-AA1B-01A Wild  other…     1.47e- 5 3.30e-1 0.920   0.590     -0.555  
##  8 TCGA-4N-A93T-01A Wild  Bcells     7.51e- 9 1.27e-7 0.0691  0.0691    -0.104  
##  9 TCGA-4N-A93T-01A Wild  CAFs       2.42e- 2 1.87e-6 0.0171  0.0171    -0.0256 
## 10 TCGA-4N-A93T-01A Wild  CD4 T…     1.14e- 1 1.40e-7 0.142   0.142     -0.214  
## # ℹ 4,559 more rows
## # ℹ 1 more variable: UpperLimit <dbl>
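##  1 : Oopps! Bcells shows no significant difference between groups
##  2 : Oopps! CAFs shows no significant difference between groups
##  3 : Oopps! CD4 Tcells shows no significant difference between groups
##  4 : Oopps! CD8 Tcells shows no significant difference between groups
##  5 : Oopps! Endothelial shows no significant difference between groups
##  6 : Oopps! Macrophages shows no significant difference between groups
##  7 : Oopps! NKcells shows no significant difference between groups
##  8 : Oopps! otherCells shows no significant difference between groups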
## Warning: Vectorized input to `element_text()` is not officially supported.
## ℹ Results may be unexpected or may change in future versions of ggplot2.
## Time difference of 22.01418 secs
print(result$imm_plot)
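
The Q1, Q3, IQR, LowerLimit and UpperLimit columns in the tibble printed above are consistent with Tukey's 1.5 × IQR outlier fences computed per cell type and group (for CAFs, 1.87e-6 − 1.5 × 0.0171 ≈ −0.0256). Whether immu_visual computes them exactly this way is an assumption inferred from the printed values. The sketch below reproduces the calculation in base R on a hypothetical toy data frame; the object names `scores` and `fences` are illustrative only.

```r
# Toy scores for two groups (hypothetical values, not package data)
scores <- data.frame(
  Group = rep(c("Wild", "Mut"), each = 5),
  method_score = c(0.10, 0.20, 0.30, 0.40, 5.00,
                   0.50, 0.60, 0.70, 0.80, 0.90)
)

# Per-group quartiles and Tukey fences (Q1 - 1.5*IQR, Q3 + 1.5*IQR)
fences <- do.call(rbind, lapply(split(scores, scores$Group), function(d) {
  q <- quantile(d$method_score, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  data.frame(Group = d$Group[1], Q1 = q[1], Q3 = q[2], IQR = iqr,
             LowerLimit = q[1] - 1.5 * iqr,
             UpperLimit = q[2] + 1.5 * iqr)
}))

# Flag values outside their own group's fences (rows are named by group)
scores$outlier <- scores$method_score < fences[scores$Group, "LowerLimit"] |
  scores$method_score > fences[scores$Group, "UpperLimit"]

print(fences)
print(scores[scores$outlier, ])
```

In this toy data only the value 5.00 in the Wild group falls outside its fences.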

# When method is xCell, estimate, or cibersort, the following approach can make the visualization more readable
result_imm01<-immu_visual(im=NULL,exp=exp_CRC[,-1],
                          method = 'xCell',
                          sample_group=Gene_group_CRC1,
                          tumor="CRC TCGA",heatmap=TRUE,
                          Type=c("Wild", "Mut"),
                          color=c("#757575", "#FF4040"),
                          geom_text=TRUE,
                          test = "wilcox.test")
## [1] "Num. of genes: 10426"
## ℹ GSVA version 2.3.1
## ! No annotation metadata available in the input expression data object
## ! Attempting to directly match identifiers in expression data to gene sets
## ℹ Calculating  ssGSEA scores for 489 gene sets
## ℹ Calculating ranks
## ℹ Calculating rank weights
## Calculating ssGSEA scores ■■■■                              11% | ETA: 10s
## Calculating ssGSEA scores ■■■■■■■■■■■                       32% | ETA:  5s
## Calculating ssGSEA scores ■■■■■■■■■■■■■■■■■■■■■■■■■■■■      89% | ETA:  1s
## Calculating ssGSEA scores ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■  100% | ETA:  0s
## ✔ Calculations finished
## # A tibble: 37,958 × 9
## # Groups:   method, Group [128]
##    SAMPLE_ID     Group method method_score       Q1       Q3      IQR LowerLimit
##    <chr>         <fct> <fct>         <dbl>    <dbl>    <dbl>    <dbl>      <dbl>
##  1 TCGA-3L-AA1B… Wild  Adipo…     3.00e-19 2.20e-21 3.54e- 2 3.54e- 2  -5.31e- 2
##  2 TCGA-3L-AA1B… Wild  Astro…     4.59e- 3 0        2.29e- 1 2.29e- 1  -3.43e- 1
##  3 TCGA-3L-AA1B… Wild  B-cel…     8.24e- 1 6.59e-18 4.13e- 1 4.13e- 1  -6.19e- 1
##  4 TCGA-3L-AA1B… Wild  Basop…     2.03e-16 5.38e-17 3.75e- 1 3.75e- 1  -5.63e- 1
##  5 TCGA-3L-AA1B… Wild  CD4+ …     4.93e- 2 2.47e-18 1.96e- 1 1.96e- 1  -2.93e- 1
##  6 TCGA-3L-AA1B… Wild  CD4+ …     2.30e- 1 1.14e- 1 5.86e- 1 4.72e- 1  -5.94e- 1
##  7 TCGA-3L-AA1B… Wild  CD8+ …     1.20e- 1 3.19e- 2 3.29e- 1 2.97e- 1  -4.14e- 1
##  8 TCGA-3L-AA1B… Wild  CD8+ …     1.31e-17 6.24e-20 2.66e- 1 2.66e- 1  -3.98e- 1
##  9 TCGA-3L-AA1B… Wild  CD8+ …     3.54e-17 0        2.90e-17 2.90e-17  -4.35e-17
## 10 TCGA-3L-AA1B… Wild  CD8+ …     0        7.00e-18 1.12e- 1 1.12e- 1  -1.68e- 1
## # ℹ 37,948 more rows
## # ℹ 1 more variable: UpperLimit <dbl>
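##  1 : Oh yeah! Adipocytes shows a significant difference between groups (p<0.05)
##  2 : Oopps! Astrocytes shows no significant difference between groups
##  3 : Oh yeah! B-cells shows a significant difference between groups (p<0.05)
##  4 : Oh yeah! Basophils shows a significant difference between groups (p<0.01)
##  5 : Oopps! CD4+ T-cells shows no significant difference between groups
##  6 : Oopps! CD4+ Tcm shows no significant difference between groups
##  7 : Oopps! CD4+ Tem shows no significant difference between groups
##  8 : Oh yeah! CD4+ memory T-cells shows a significant difference between groups (p<0.01)
##  9 : Oopps! CD4+ naive T-cells shows no significant difference between groups
##  10 : Oh yeah! CD8+ T-cells shows a significant difference between groups (p<0.001)
##  11 : Oh yeah! CD8+ Tcm shows a significant difference between groups (p<0.0001)
##  12 : Oh yeah! CD8+ Tem shows a significant difference between groups (p<0.0001)
##  13 : Oopps! CD8+ naive T-cells shows no significant difference between groups
##  14 : Oopps! CLP shows no significant difference between groups
##  15 : Oopps! CMP shows no significant difference between groups
##  16 : Oopps! Chondrocytes shows no significant difference between groups
##  17 : Oh yeah! Class-switched memory B-cells shows a significant difference between groups (p<0.05)
##  18 : Oopps! DC shows no significant difference between groups
##  19 : Oopps! Endothelial cells shows no significant difference between groups
##  20 : Oh yeah! Eosinophils shows a significant difference between groups (p<0.05)
##  21 : Oopps! Epithelial cells shows no significant difference between groups
##  22 : Oh yeah! Erythrocytes shows a significant difference between groups (p<0.05)
##  23 : Oh yeah! Fibroblasts shows a significant difference between groups (p<0.01)
##  24 : Oh yeah! GMP shows a significant difference between groups (p<0.05)
##  25 : Oh yeah! HSC shows a significant difference between groups (p<0.01)
##  26 : Oh yeah! Hepatocytes shows a significant difference between groups (p<0.01)
##  27 : Oopps! Keratinocytes shows no significant difference between groups
##  28 : Oh yeah! MEP shows a significant difference between groups (p<0.01)
##  29 : Oopps! MPP shows no significant difference between groups
##  30 : Oopps! MSC shows no significant difference between groups
##  31 : Oh yeah! Macrophages shows a significant difference between groups (p<0.01)
##  32 : Oh yeah! Macrophages M1 shows a significant difference between groups (p<0.001)
##  33 : Oh yeah! Macrophages M2 shows a significant difference between groups (p<0.05)
##  34 : Oh yeah! Mast cells shows a significant difference between groups (p<0.05)
##  35 : Oopps! Megakaryocytes shows no significant difference between groups
##  36 : Oopps! Melanocytes shows no significant difference between groups
##  37 : Oopps! Memory B-cells shows no significant difference between groups
##  38 : Oopps! Mesangial cells shows no significant difference between groups
##  39 : Oh yeah! Monocytes shows a significant difference between groups (p<0.05)
##  40 : Oopps! Myocytes shows no significant difference between groups
##  41 : Oh yeah! NK cells shows a significant difference between groups (p<0.001)
##  42 : Oh yeah! NKT shows a significant difference between groups (p<0.01)
##  43 : Oh yeah! Neurons shows a significant difference between groups (p<0.01)
##  44 : Oopps! Neutrophils shows no significant difference between groups
##  45 : Oopps! Osteoblast shows no significant difference between groups
##  46 : Oopps! Pericytes shows no significant difference between groups
##  47 : Oopps! Plasma cells shows no significant difference between groups
##  48 : Oopps! Platelets shows no significant difference between groups
##  49 : Oopps! Preadipocytes shows no significant difference between groups
##  50 : Oopps! Sebocytes shows no significant difference between groups
##  51 : Oopps! Skeletal muscle shows no significant difference between groups
##  52 : Oopps! Smooth muscle shows no significant difference between groups
##  53 : Oopps! Tgd cells shows no significant difference between groups
##  54 : Oh yeah! Th1 cells shows a significant difference between groups (p<0.05)
##  55 : Oh yeah! Th2 cells shows a significant difference between groups (p<0.05)
##  56 : Oopps! Tregs shows no significant difference between groups
##  57 : Oh yeah! aDC shows a significant difference between groups (p<0.01)
##  58 : Oopps! cDC shows no significant difference between groups
##  59 : Oopps! iDC shows no significant difference between groups
##  60 : Oopps! ly Endothelial cells shows no significant difference between groups
##  61 : Oopps! mv Endothelial cells shows no significant difference between groups
##  62 : Oopps! naive B-cells shows no significant difference between groups
##  63 : Oopps! pDC shows no significant difference between groups
##  64 : Oopps! pro B-cells shows no significant difference between groups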
## Warning: Vectorized input to `element_text()` is not officially supported.
## ℹ Results may be unexpected or may change in future versions of ggplot2.
## Time difference of 15.16357 secs
result_imm01_1<-immu_visual(im=result_imm01$imm_data[,c(1,67,68,69)],exp=NULL,
                            method = 'xCell',
                            sample_group=Gene_group_CRC1,
                            tumor=" ",heatmap=TRUE,
                            Type=c("Wild", "Mut"),
                            color=c("#757575", "#FF4040"),
                            geom_text=TRUE,
                            test = "wilcox.test")
## # A tibble: 1,846 × 9
## # Groups:   method, Group [6]
##    SAMPLE_ID   Group method method_score    Q1    Q3   IQR LowerLimit UpperLimit
##    <chr>       <fct> <fct>         <dbl> <dbl> <dbl> <dbl>      <dbl>      <dbl>
##  1 TCGA-3L-AA… Wild  Immun…      1.03    0.304 1.32  1.02      -1.22        2.85
##  2 TCGA-3L-AA… Wild  Strom…      0.517   0.101 0.479 0.378     -0.465       1.05
##  3 TCGA-3L-AA… Wild  Micro…      1.55    0.514 1.71  1.20      -1.28        3.50
##  4 TCGA-4N-A9… Wild  Immun…      0.725   0.304 1.32  1.02      -1.22        2.85
##  5 TCGA-4N-A9… Wild  Strom…      0.0237  0.101 0.479 0.378     -0.465       1.05
##  6 TCGA-4N-A9… Wild  Micro…      0.749   0.514 1.71  1.20      -1.28        3.50
##  7 TCGA-4T-AA… Wild  Immun…      0.371   0.304 1.32  1.02      -1.22        2.85
##  8 TCGA-4T-AA… Wild  Strom…      0.00474 0.101 0.479 0.378     -0.465       1.05
##  9 TCGA-4T-AA… Wild  Micro…      0.376   0.514 1.71  1.20      -1.28        3.50
## 10 TCGA-5M-AA… Wild  Immun…      0.0676  0.304 1.32  1.02      -1.22        2.85
## # ℹ 1,836 more rows
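##  1 : Oh yeah! ImmuneScore shows a significant difference between groups (p<0.01)
##  2 : Oopps! StromaScore shows no significant difference between groups
##  3 : Oopps! MicroenvironmentScore shows no significant difference between groups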
## Warning: Vectorized input to `element_text()` is not officially supported.
## ℹ Results may be unexpected or may change in future versions of ggplot2.
## Time difference of 0.573478 secs
result_imm01$imm_plot+ 
  theme(legend.position = "right")+
  annotation_custom(
    grob = ggplotGrob(result_imm01_1$imm_plot+ 
                        theme(legend.position = "none",
                              axis.title.y=element_blank(),
                              axis.text.x = element_text(face = "plain", angle = 30,
                                                         size=8,
                                                         hjust = 1))
    ),
    xmin = 30.5,
    xmax = 53.5,
    ymin = 2.0,
    ymax = 3.5
  ) 
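
The per-cell-type messages printed by immu_visual report whether the two groups differ and at which threshold (p<0.05, p<0.01, p<0.001, p<0.0001) when `test = "wilcox.test"` is used. A minimal base R sketch of that idea follows. The helper `p_label` and the toy data are hypothetical, and the exact thresholding rule used inside the package is an assumption.

```r
# Toy scores: two clearly separated groups, no ties (hypothetical data)
df <- data.frame(
  score = c(1:10, 11:20),
  group = factor(rep(c("Wild", "Mut"), each = 10), levels = c("Wild", "Mut"))
)

# Two-sided Wilcoxon rank-sum test between the groups
res <- wilcox.test(score ~ group, data = df)

# Map a p-value to the most stringent threshold it passes
p_label <- function(p) {
  if (p < 1e-4) "p<0.0001"
  else if (p < 1e-3) "p<0.001"
  else if (p < 0.01) "p<0.01"
  else if (p < 0.05) "p<0.05"
  else "ns"
}

cat("p-value:", signif(res$p.value, 3), "->", p_label(res$p.value), "\n")
```

When many cell types are tested at once, as in the 64 xCell signatures above, a multiple-testing correction such as `p.adjust(p, method = "BH")` is worth considering before interpreting individual results.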

5.2.2.2 limma.dif.visual

This function runs a differential gene expression analysis between two groups and visualizes the results (volcano plot, heatmap, and GO/KEGG enrichment plots).

#example
library(limma)
library(DESeq2)
## Loading required package: S4Vectors
## Loading required package: stats4
## Loading required package: BiocGenerics
## Loading required package: generics
## 
## Attaching package: 'generics'
## The following objects are masked from 'package:igraph':
## 
##     components, union
## The following object is masked from 'package:dplyr':
## 
##     explain
## The following objects are masked from 'package:base':
## 
##     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
##     setequal, union
## 
## Attaching package: 'BiocGenerics'
## The following object is masked from 'package:limma':
## 
##     plotMA
## The following objects are masked from 'package:ggcor':
## 
##     ncols, nrows
## The following objects are masked from 'package:igraph':
## 
##     normalize, path
## The following object is masked from 'package:dplyr':
## 
##     combine
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
##     as.data.frame, basename, cbind, colnames, dirname, do.call,
##     duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
##     mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
##     unsplit, which.max, which.min
## 
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:clusterProfiler':
## 
##     rename
## The following object is masked from 'package:tidyr':
## 
##     expand
## The following objects are masked from 'package:dplyr':
## 
##     first, rename
## The following object is masked from 'package:utils':
## 
##     findMatches
## The following objects are masked from 'package:base':
## 
##     I, expand.grid, unname
## Loading required package: IRanges
## 
## Attaching package: 'IRanges'
## The following object is masked from 'package:clusterProfiler':
## 
##     slice
## The following objects are masked from 'package:dplyr':
## 
##     collapse, desc, slice
## The following object is masked from 'package:grDevices':
## 
##     windows
## Loading required package: GenomicRanges
## Loading required package: GenomeInfoDb
## Warning: package 'GenomeInfoDb' was built under R version 4.5.1
## Loading required package: SummarizedExperiment
## Loading required package: MatrixGenerics
## Loading required package: matrixStats
## 
## Attaching package: 'matrixStats'
## The following object is masked from 'package:dplyr':
## 
##     count
## 
## Attaching package: 'MatrixGenerics'
## The following objects are masked from 'package:matrixStats':
## 
##     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
##     colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
##     colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
##     colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
##     colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
##     colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
##     colWeightedMeans, colWeightedMedians, colWeightedSds,
##     colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
##     rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
##     rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
##     rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
##     rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
##     rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
##     rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
##     rowWeightedSds, rowWeightedVars
## Loading required package: Biobase
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.
## 
## Attaching package: 'Biobase'
## The following object is masked from 'package:MatrixGenerics':
## 
##     rowMedians
## The following objects are masked from 'package:matrixStats':
## 
##     anyMissing, rowMedians
## The following object is masked from 'package:Hmisc':
## 
##     contents
library(edgeR)
## Warning: package 'edgeR' was built under R version 4.5.1
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0          ✔ purrr     1.1.0.9000
## ✔ lubridate 1.9.4          ✔ readr     2.1.5
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ lubridate::%--%()         masks igraph::%--%()
## ✖ lubridate::%within%()     masks IRanges::%within%()
## ✖ BiocGenerics::Position()  masks ggplot2::Position(), base::Position()
## ✖ tibble::as_data_frame()   masks igraph::as_data_frame(), dplyr::as_data_frame()
## ✖ IRanges::collapse()       masks dplyr::collapse()
## ✖ Biobase::combine()        masks BiocGenerics::combine(), dplyr::combine()
## ✖ purrr::compose()          masks igraph::compose()
## ✖ matrixStats::count()      masks dplyr::count()
## ✖ tidyr::crossing()         masks igraph::crossing()
## ✖ IRanges::desc()           masks dplyr::desc()
## ✖ S4Vectors::expand()       masks tidyr::expand()
## ✖ clusterProfiler::filter() masks ggcor::filter(), dplyr::filter(), stats::filter()
## ✖ S4Vectors::first()        masks dplyr::first()
## ✖ dplyr::lag()              masks stats::lag()
## ✖ purrr::reduce()           masks GenomicRanges::reduce(), IRanges::reduce()
## ✖ S4Vectors::rename()       masks clusterProfiler::rename(), dplyr::rename()
## ✖ lubridate::second()       masks S4Vectors::second()
## ✖ lubridate::second<-()     masks S4Vectors::second<-()
## ✖ purrr::simplify()         masks clusterProfiler::simplify(), igraph::simplify()
## ✖ IRanges::slice()          masks clusterProfiler::slice(), dplyr::slice()
## ✖ Hmisc::src()              masks dplyr::src()
## ✖ Hmisc::summarize()        masks dplyr::summarize()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(ggrepel)
library(ComplexHeatmap)
library(dplyr)
library(org.Hs.eg.db)
## Loading required package: AnnotationDbi
## 
## Attaching package: 'AnnotationDbi'
## 
## The following object is masked from 'package:clusterProfiler':
## 
##     select
## 
## The following object is masked from 'package:dplyr':
## 
##     select
library(clusterProfiler)
library(enrichplot)
## Warning: package 'enrichplot' was built under R version 4.5.1
## enrichplot v1.28.4 Learn more at https://yulab-smu.top/contribution-knowledge-mining/
## 
## Please cite:
## 
## T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan,
## X Fu, S Liu, X Bo, and G Yu. clusterProfiler 4.0: A universal
## enrichment tool for interpreting omics data. The Innovation. 2021,
## 2(3):100141
## 
## Attaching package: 'enrichplot'
## 
## The following object is masked from 'package:ggpubr':
## 
##     color_palette
library(base)
exp_CRC<-exp_raw$data
result<-limma.dif.visual(exprdata=exp_CRC[,-1],
                           pdata=Gene_group_CRC1,datatype="TPM",
                           Type=c("Wild", "Mut"),diff_method="limma",
                           contrastfml="Wild - Mut",
                           tumor="CRC TCGA",
                           P.Value=0.05,
                           logFC=0.5,tidyHeatmap=TRUE,
                           color= NULL,
                           ann_colors = list(regulate = c(Down = "#1B9E77", Up = "#D95F02"),
                           PREX2 = c(Wild = "#757575", Mut = "#FF4040")),
                           Regulate=c("Up","Down"),GO=TRUE,GO.plot="dotplot",split=TRUE,
                           KEGG=TRUE,KEGG.plot="dotplot",rel_heights= c(1.5, 0.5, 1))
## 'data.frame':    35487 obs. of  8 variables:
##  $ logFC    : num  -0.919 -1.096 -1.065 -1.086 -0.826 ...
##  $ AveExpr  : num  -2.94e-01 -1.03e-01 -4.10e-01 -1.76e-06 -4.47e-01 ...
##  $ t        : num  -8.57 -8.05 -7.78 -7.7 -7.54 ...
##  $ P.Value  : num  8.04e-17 4.11e-15 2.98e-14 5.21e-14 1.60e-13 ...
##  $ adj.P.Val: num  2.87e-12 7.35e-11 3.55e-10 4.65e-10 1.14e-09 ...
##  $ B        : num  27.2 23.5 21.6 21.1 20 ...
##  $ regulate : chr  "Down (1491 genes)" "Down (1491 genes)" "Down (1491 genes)" "Down (1491 genes)" ...
##  $ gene     : chr  "DLGAP1-AS5" "HMSD" "TRDV1" "MBP" ...
##   variables              idd target_group   value condiction
## 1      A1CF TCGA-AA-3692-01A         Wild -0.0368         Up
## 2      A1CF TCGA-WS-AB45-01A         Wild -0.3920         Up
## 3      A1CF TCGA-AA-3561-01A         Wild  0.2775         Up
## 4      A1CF TCGA-D5-6930-01A         Wild -0.5037         Up
## 5      A1CF TCGA-AG-3582-01A         Wild  0.0393         Up
## 6      A1CF TCGA-AA-3516-01A         Wild -1.3484         Up
## 'select()' returned 1:1 mapping between keys and columns
## using 'fgsea' for GSEA analysis, please cite Korotkevich et al (2019).
## 
## preparing geneSet collections...
## GSEA analysis...
## Warning in fgseaMultilevel(pathways = pathways, stats = stats, minSize =
## minSize, : For some pathways, in reality P-values are less than 1e-10. You can
## set the `eps` argument to zero for better estimation.
## leading edge analysis...
## done...
## Reading KEGG annotation online: "https://rest.kegg.jp/link/hsa/pathway"...
## Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/hsa"...
## using 'fgsea' for GSEA analysis, please cite Korotkevich et al (2019).
## 
## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
print(result$volcano_plot)

print(result$heatmap)

print(result$GO_plot)

print(result$KEGG_plot)
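
The call above filters on `P.Value=0.05` and `logFC=0.5` and labels genes as Up or Down. With the limma contrast `"Wild - Mut"`, a positive logFC means higher expression in Wild than in Mut. The sketch below shows one plausible way such a classification can be done in base R; the toy table, the "NotSig" label and the use of strict inequalities are assumptions, not the package's confirmed logic.

```r
# Toy differential expression table (hypothetical values)
de <- data.frame(
  gene    = c("G1", "G2", "G3", "G4"),
  logFC   = c(-0.919, 0.80, 0.20, -1.10),
  P.Value = c(8e-17, 0.01, 0.001, 0.20)
)

p_cut  <- 0.05  # corresponds to P.Value = 0.05 in the call above
fc_cut <- 0.5   # corresponds to logFC = 0.5 in the call above

# A gene is Up/Down only if it passes both the p-value and the fold-change cutoff
de$regulate <- ifelse(de$P.Value < p_cut & de$logFC >  fc_cut, "Up",
               ifelse(de$P.Value < p_cut & de$logFC < -fc_cut, "Down", "NotSig"))

print(de)
print(table(de$regulate))
```

G3 is significant but has a small effect, and G4 has a large effect but is not significant, so both end up as NotSig.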

5.2.2.3 gene.mut_exp

This function visualizes differences in gene expression according to gene mutation status.

#example
#gene.mut_exp(mutation_data=mutation_CRC,exp=exp_CRC[,-1],colnum=2,gene=NULL,top=10,visual=TRUE,
#                      test_type= "parametric",title= "CRC TCGA",test="wilcox.test",only_red=TRUE,gene_vaf=F,
#                     color_0.05= "#FF34B3",color_0.01="#9400D3",color_0.001="#CD3700",color_0.0001="red",bar=T)

5.2.2.4 ScRNA

This function processes single-cell RNA sequencing data, including data normalization, feature selection, and dimensionality reduction.

#example
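
The vignette gives no example call for ScRNA, and its arguments are not shown here, so the sketch below does not call it. Instead it illustrates the three steps named in the description (normalization, feature selection, dimensionality reduction) in base R on simulated counts. All object names and parameter choices (scale factor 1e4, 50 features, 10 components) are assumptions for illustration; dedicated single-cell toolkits would normally be used for real data.

```r
set.seed(1)

# Simulated count matrix: 200 genes x 30 cells (hypothetical data)
counts <- matrix(rpois(200 * 30, lambda = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("cell", 1:30)))

# 1. Normalization: scale each cell to 10,000 total counts, then log1p
norm <- log1p(t(t(counts) / colSums(counts)) * 1e4)

# 2. Feature selection: keep the 50 most variable genes
gene_var  <- apply(norm, 1, var)
top_genes <- names(sort(gene_var, decreasing = TRUE))[1:50]

# 3. Dimensionality reduction: PCA on cells x selected genes
pca <- prcomp(t(norm[top_genes, ]), center = TRUE, scale. = TRUE)
embedding <- pca$x[, 1:10]

dim(embedding)  # cells x principal components
```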

5.3 Other

5.3.1 HRDscore

Determines genomic scar scores (telomeric allelic imbalance (TAI), loss of heterozygosity (LOH), and large-scale state transitions (LST)), which are signs of homologous recombination deficiency (HRD).

#example
root_dir <-system.file("HRD", package = "Mypackage")
data("metadata")
HRD<-HRDscore(dirpath=root_dir,file_id_map=metadata,ploidy=NULL,reference="grch38")
## Warning: package 'data.table' was built under R version 4.5.1
## 
## Attaching package: 'data.table'
## The following objects are masked from 'package:lubridate':
## 
##     hour, isoweek, mday, minute, month, quarter, second, wday, week,
##     yday, year
## The following object is masked from 'package:purrr':
## 
##     transpose
## The following object is masked from 'package:SummarizedExperiment':
## 
##     shift
## The following object is masked from 'package:GenomicRanges':
## 
##     shift
## The following object is masked from 'package:IRanges':
## 
##     shift
## The following objects are masked from 'package:S4Vectors':
## 
##     first, second
## The following objects are masked from 'package:dplyr':
## 
##     between, first, last
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI 
## Determining HRD-LOH, LST, TAI
head(HRD)
##       SampleID LOH TAI LST HRDsum  HRD
## 1 TCGA-24-1930   7  21  23     51 HRD+
## 2 TCGA-25-1321  14  20  20     54 HRD+
## 3 TCGA-04-1525   3  19   7     29 HRD-
## 4 TCGA-23-1122   5  33  28     66 HRD+
## 5 TCGA-09-1667   7  26  11     44 HRD+
## 6 TCGA-61-1895  21  25  34     80 HRD+
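
In the table above, HRDsum is the sum of the three component scores (for the first sample, 7 + 21 + 23 = 51). The sketch below rebuilds that column from the printed values and assigns a status. The cutoff of 42 is an assumption: it is a commonly cited threshold for this kind of combined score and is consistent with the six rows shown, but the vignette does not state which rule HRDscore applies.

```r
# Component scores copied from the head(HRD) output above
hrd <- data.frame(
  SampleID = c("TCGA-24-1930", "TCGA-25-1321", "TCGA-04-1525",
               "TCGA-23-1122", "TCGA-09-1667", "TCGA-61-1895"),
  LOH = c(7, 14, 3, 5, 7, 21),
  TAI = c(21, 20, 19, 33, 26, 25),
  LST = c(23, 20, 7, 28, 11, 34)
)

# Combined genomic scar score
hrd$HRDsum <- hrd$LOH + hrd$TAI + hrd$LST

# Assumed cutoff (not confirmed by the package documentation shown here)
cutoff <- 42
hrd$HRD <- ifelse(hrd$HRDsum >= cutoff, "HRD+", "HRD-")

print(hrd)
```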

5.3.2 sig_Heatmap

sig_Heatmap() takes a tbl object and produces a ComplexHeatmap plot, integrating with the tibble and dplyr frameworks.

#example
data("input1")
data("condiction")
feas1<-colnames(input1)[3:102]
sig_Heatmap(input = input1, features = feas1,ID ="SAMPLE_ID",show_plot=F,
            condiction=condiction,id_condiction=colnames(condiction)[[1]],col_condiction=colnames(condiction)[[2]],
             cols_group=c("#757575","#FF4040"),row_group=c("red","green"),
             legend_show=TRUE,column_title_size=10,row_title_size=8,
             heatmap_col=NULL,
             #heatmap_col=c("#0505FA", "#FFFFFF", "#FA050D"),
             group = "PREX2",row_title="Regulate", scale = TRUE,name="Expression")
##   variables             idd target_group   value condiction
## 1     ABCC5 TCGA-IN-A6RN-01         Wild  1.0065         Up
## 2     ABCC5 TCGA-BR-8687-01         Wild -0.3104         Up
## 3     ABCC5 TCGA-VQ-AA6I-01         Wild  0.2197         Up
## 4     ABCC5 TCGA-CD-A4MI-01          Mut -0.4678         Up
## 5     ABCC5 TCGA-IN-A6RL-01         Wild  2.2081         Up
## 6     ABCC5 TCGA-MX-A666-01         Wild  0.7078         Up

5.3.3 ggsurvplots

Drawing Survival Curves Using ggplot2

#example
data("clin_TCGA")
ggsurvplots(data = clin_TCGA, conf.int = FALSE,time_col = "PFS_MONTHS",
                 status_col = "PFS_STATUS", group_col = "Status", pvalue_table = TRUE,
                 palette = ggsci::pal_ucscgb()(4), risk.table = FALSE, title = NULL,
                 legend.labs = c("no KRAS or TP53", "TP53", "KRAS", "KRAS&TP53"),
                 xlab = "PFS_MONTHS", ylab = "Survival probability",
                 surv.median.line="hv",surv.scale="default",legend=FALSE)
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:Biobase':
## 
##     combine
## The following object is masked from 'package:BiocGenerics':
## 
##     combine
## The following object is masked from 'package:dplyr':
## 
##     combine
## [1] "#00FF00FF" "#FF9900FF" "#FF0000FF" "#FFCC00FF"
## Warning in geom_segment(aes(x = 0, y = max(y2), xend = max(x1), yend = max(y2)), : All aesthetics have length 1, but the data has 4 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
##   a single row.
## All aesthetics have length 1, but the data has 4 rows.
## ℹ Please consider using `annotate()` or provide this layer with data containing
##   a single row.
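
For readers who want to see the estimates behind a survival plot like the one drawn by ggsurvplots, the sketch below fits Kaplan-Meier curves and a log-rank test with the survival package. It is an illustration on hypothetical toy data, not the internals of ggsurvplots; the column names `time`, `status` and `group` are placeholders for columns such as PFS_MONTHS, PFS_STATUS and Status.

```r
library(survival)

# Toy data: status 1 = event, 0 = censored (hypothetical values)
d <- data.frame(
  time   = c(5, 8, 12, 20, 2, 3, 4, 6),
  status = c(1, 0, 1, 1, 1, 1, 1, 1),
  group  = rep(c("A", "B"), each = 4)
)

# Kaplan-Meier estimate per group
fit <- survfit(Surv(time, status) ~ group, data = d)
s <- summary(fit)
km <- data.frame(group = s$strata, time = s$time, surv = s$surv)
print(km)

# Log-rank test comparing the groups (1 degree of freedom for 2 groups)
lr <- survdiff(Surv(time, status) ~ group, data = d)
p_logrank <- 1 - pchisq(lr$chisq, df = 1)
cat("log-rank p =", signif(p_logrank, 3), "\n")
```

In group A the censored observation at time 8 leaves the curve unchanged but reduces the number at risk, so the next event drops the estimate from 0.75 to 0.375.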

5.3.4 cox_forest

Cox Proportional Hazards Univariate and Multivariate Forest Plot Generator

#example
data("aa")
cox <- cox_forest(data = aa,
                  time_col = "PFI.time",
                  status_col = "PFI",
                  Univariate = TRUE,
                  univar_predictors = colnames(aa)[c(5:7, 18:22, 34, 31)],
                  Multivariate = TRUE,
                  multivar_predictors = colnames(aa)[c(5:7, 18:21, 33, 31)],
                  show_plots = TRUE,
                  xticks1 = NULL,  # e.g. c(0,0.25,0.5,0.75,1.00,1.25,1.5,6.5,11)
                  xticks2 = NULL,  # e.g. c(0,0.25,0.5,0.75,1.00,2,2.5,6,15)
                  title_univar = "PFI Univariate",
                  title_multivar = "PFI Multivariate",
                  use_baseline_table = TRUE, all = FALSE, forestplot = FALSE,
                  ci_pch = 16, ci_col = "darkred", ci_line = "lightgreen", zero_col = "#e22e2a",
                  log2 = TRUE,
                  footnote = paste("\nHRD5: with HRD value adjusted(median)", "HRD adjusted =   LST-15.5*ploidy+LOH+TAI ", sep = "\n"))
## Loading required package: forestplot
## Warning: package 'forestplot' was built under R version 4.5.1
## Loading required package: checkmate
## 
## Attaching package: 'checkmate'
## The following object is masked from 'package:Biobase':
## 
##     anyMissing
## The following object is masked from 'package:matrixStats':
## 
##     anyMissing
## Loading required package: abind
## Loading required package: forestploter
## Loading required package: tableone
## Loading required package: plyr
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following object is masked from 'package:purrr':
## 
##     compact
## The following object is masked from 'package:matrixStats':
## 
##     count
## The following object is masked from 'package:IRanges':
## 
##     desc
## The following object is masked from 'package:S4Vectors':
## 
##     rename
## The following objects are masked from 'package:clusterProfiler':
## 
##     arrange, mutate, rename, summarise
## The following object is masked from 'package:ggpubr':
## 
##     mutate
## The following object is masked from 'package:ggcor':
## 
##     mutate
## The following objects are masked from 'package:Hmisc':
## 
##     is.discrete, summarize
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
##                                
##                                 level      Overall      
##   n                                          604        
##   GRADE (%)                     G1             6 ( 1.0) 
##                                 G2            78 (12.9) 
##                                 G3           503 (83.3) 
##                                 G4             1 ( 0.2) 
##                                 GB             2 ( 0.3) 
##                                 GX            10 ( 1.7) 
##                                 missing        4 ( 0.7) 
##   AGE (mean (SD))                          59.63 (11.47)
##   AJCC_Stage (%)                I             17 ( 2.8) 
##                                 II            33 ( 5.5) 
##                                 III          460 (76.2) 
##                                 IV            89 (14.7) 
##                                 missing        5 ( 0.8) 
##   TMB_NONSYNONYMOUS (mean (SD))             1.63 (0.91) 
##   LOH (mean (SD))                          12.19 (10.39)
##   TAI (mean (SD))                          22.37 (7.21) 
##   LST (mean (SD))                          23.50 (22.60)
##   HRDsum (mean (SD))                       58.06 (35.07)
##   HRD6 (%)                      BRCA+&HRD+    11 ( 1.8) 
##                                 BRCA+&HRD-    12 ( 2.0) 
##                                 BRCA-&HRD+   285 (47.2) 
##                                 BRCA-&HRD-   275 (45.5) 
##                                 missing       21 ( 3.5) 
##   adjusted_HRDsum (mean (SD))              27.72 (44.41)
## Warning in coxph.fit(X, Y, istrat, offset, init, control, weights = weights, :
## Loglik converged before variable 8,9,10 ; coefficient may be infinite.
##                                
##                                 level    Overall      
##   n                                        604        
##   GRADE (%)                     G1           6 ( 1.0) 
##                                 G2          78 (12.9) 
##                                 G3         503 (83.3) 
##                                 G4           1 ( 0.2) 
##                                 GB           2 ( 0.3) 
##                                 GX          10 ( 1.7) 
##                                 missing      4 ( 0.7) 
##   AGE (mean (SD))                        59.63 (11.47)
##   AJCC_Stage (%)                I           17 ( 2.8) 
##                                 II          33 ( 5.5) 
##                                 III        460 (76.2) 
##                                 IV          89 (14.7) 
##                                 missing      5 ( 0.8) 
##   TMB_NONSYNONYMOUS (mean (SD))           1.63 (0.91) 
##   LOH (mean (SD))                        12.19 (10.39)
##   TAI (mean (SD))                        22.37 (7.21) 
##   LST (mean (SD))                        23.50 (22.60)
##   HRD5 (%)                      Negative   275 (45.5) 
##                                 Positive   308 (51.0) 
##                                 missing     21 ( 3.5) 
##   adjusted_HRDsum (mean (SD))            27.72 (44.41)
print(cox[["univariate"]][["uniforest"]])

#grid::grid.newpage()
print(cox[["multivariate"]][["multiforest"]])
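
The forest plots summarize hazard ratios from univariate and multivariate Cox models. As a rough illustration of the numbers behind such a plot, the sketch below fits both kinds of model with `survival::coxph` on the `lung` dataset that ships with the survival package. The helper `hr_table` is hypothetical, and this is not the internal code of cox_forest.

```r
library(survival)

# Univariate model: one predictor at a time
uni_sex <- coxph(Surv(time, status) ~ sex, data = lung)

# Multivariate model: several predictors adjusted for each other
multi <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Collect hazard ratios, 95% confidence intervals and Wald p-values
hr_table <- function(fit) {
  ci <- confint(fit)  # intervals on the log-hazard scale
  data.frame(
    term  = names(coef(fit)),
    HR    = exp(coef(fit)),
    lower = exp(ci[, 1]),
    upper = exp(ci[, 2]),
    p     = summary(fit)$coefficients[, "Pr(>|z|)"],
    row.names = NULL
  )
}

print(hr_table(uni_sex))
print(hr_table(multi))
```

A forest plot then draws one row per term, with the point at HR and the whiskers at the lower and upper bounds, usually on a log scale so that HR = 1 is the reference line.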

5.3.5 GSEAplot2

A GSEA plot that mimics the plot generated by the Broad Institute's GSEA software.

#example
n_sets <- length(result$GO@result[["ID"]])
# Plot at most the first 10 enriched gene sets
GSEAplot2(result$GO, geneSetID = seq_len(min(10, n_sets)),
          ES_geom = "line", legend.position = "none", pvalue_table = FALSE,
          title = "GO enrichment", rel_heights = c(4, 3, 2),
          base_size = 20, Type = c("Wild", "Mut"))
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
##  immune system process (pvalue = 0,NES = -3.052512) 
##  immune response (pvalue = 0,NES = -2.953593) 
##  defense response to symbiont (pvalue = 0,NES = -2.902195) 
##  defense response to other organism (pvalue = 0,NES = -2.802582) 
##  response to external biotic stimulus (pvalue = 0,NES = -2.79815) 
##  response to other organism (pvalue = 0,NES = -2.79815) 
##  response to biotic stimulus (pvalue = 0,NES = -2.765313) 
##  biological process involved in interspecies interaction between organisms (pvalue = 0,NES = -2.77826) 
##  innate immune response (pvalue = 1e-06,NES = -2.809501) 
##  regulation of immune response (pvalue = 1e-06,NES = -2.709404) 
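
The top panel of a GSEA plot shows the running enrichment score, and the normalized enrichment score (NES) reported above is derived from its extreme value. The sketch below computes the running score for one gene set using the standard weighted running-sum formulation. It is a simplified illustration on hypothetical data: the function `running_es` is not part of the package, and no permutation-based normalization or p-value is computed.

```r
# Running enrichment score for one gene set
# stats: named numeric vector of ranking statistics (e.g. logFC or t)
# gene_set: character vector of gene names; p: weighting exponent
running_es <- function(stats, gene_set, p = 1) {
  stats  <- sort(stats, decreasing = TRUE)
  in_set <- names(stats) %in% gene_set
  hit_w  <- abs(stats)^p * in_set            # weight of each hit
  p_hit  <- cumsum(hit_w) / sum(hit_w)       # cumulative weighted hits
  p_miss <- cumsum(!in_set) / sum(!in_set)   # cumulative misses
  running <- p_hit - p_miss
  list(running = running, ES = running[[which.max(abs(running))]])
}

# Toy ranking in which the gene set sits at the top of the list
stats <- c(g1 = 5, g2 = 4, g3 = 3, g4 = 2, g5 = 1, g6 = -1)
res_es <- running_es(stats, gene_set = c("g1", "g2"))

print(round(res_es$running, 3))
print(res_es$ES)
```

A gene set concentrated at the bottom of the ranked list gives a negative score instead, which is the pattern seen for the immune-related terms listed above.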

6 References

  1. Yao J, Sun Q, Wu H, et al. Decoding the molecular landscape: HER2 and PD-L1 in advanced gastric cancer. Front Immunol. 2025;16:1567308. doi:10.3389/fimmu.2025.1567308
  2. Qiu Q, Tan D, Chen Q, et al. Clinical implications of PD-L1 expression and pathway-related molecular subtypes in advanced Asian colorectal cancer patients. Am J Cancer Res. 2024;14(2):796-808. doi:10.62347/FSSF9938
  3. Ding W, Yang P, Zhao X, et al. Unraveling EGFR-TKI resistance in lung cancer with high PD-L1 or TMB in EGFR-sensitive mutations. Respir Res. 2024;25(1):40. doi:10.1186/s12931-023-02656-3
  4. Peng H, Ying J, Zang J, et al. Specific Mutations in APC, with Prognostic Implications in Metastatic Colorectal Cancer. Cancer Res Treat. 2023;55(4):1270-1280. doi:10.4143/crt.2023.415
  5. Jiang Y, Mai G, Zhao X, et al. Molecular characterization and prognostic implications of KRAS mutations in pancreatic cancer patients: insights from multi-cohort analysis. NPJ Precis Oncol. 2025;9(1):299. doi:10.1038/s41698-025-01087-1
  6. Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487(7407):330-337. doi:10.1038/nature11252